This project uses Meta’s Prophet package to forecast atmospheric CO2 levels. Prophet is suited to time series data with distinct seasonal patterns and long-term trends, which makes it a good fit for the co2 dataset that serves as the basis for this analysis.
The key objectives of this project are to:
Load and prepare the co2 dataset for
analysis
Apply Prophet to forecast future atmospheric CO2 levels
Visualise and interpret the forecasted trends and seasonal patterns
Comment on results, including trends and limitations
prophet
Prophet is a forecasting tool by Meta that helps predict future values in time series data, especially when the data show seasonality and a trend over time.
How prophet models trends and seasonalities
Why is Prophet appropriate for the co2 dataset?
co2 dataset
The co2 dataset within R contains monthly atmospheric CO2 concentrations measured at the Mauna Loa Observatory in Hawaii, from 1959 to 1997.
Atmospheric CO2 is a greenhouse gas contributing to global warming. Monitoring its levels over time helps understand climate change trends and seasonal environmental behaviours.
This data is ideal for time series analysis because it covers a long period, so both the long-term trend and the seasonal pattern can be studied.
The following R packages were used for this analysis:
prophet for time series forecasting
zoo for working with time series data
tidyverse for data manipulation and cleaning
Prophet requires that the data be in a dataframe format with two specific columns, ds (date) and y (values to forecast).
[1] 315.42 316.31 316.50 317.56 318.13 318.00
At the moment our co2 dataset is a time series
(ts) object, so it must first be converted into a
dataframe.
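Before converting, it can help to confirm what kind of object co2 is. The sketch below uses only base R functions on the built-in co2 dataset; it is an illustrative check, not part of the original analysis.

```r
# co2 ships with base R (datasets package), so no extra packages are needed
class(co2)       # object class of the series
frequency(co2)   # observations per year (12 means monthly data)
start(co2)       # year and month of the first observation
end(co2)         # year and month of the last observation
head(co2)        # first six monthly CO2 values
```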
To convert the data into a data frame:
ds should contain the date values
y should contain the numeric values to forecast (in this case the CO2 measurements)
This can be done with the code below:
co2_dataframe = data.frame(
  # Convert time index to year/month format
  ds = zoo::as.yearmon(time(co2)),
  # Keep the CO2 measurements as numeric values
  y = co2)
head(co2_dataframe) # verify that dataframe structure is correct
        ds      y
1 Jan 1959 315.42
2 Feb 1959 316.31
3 Mar 1959 316.50
4 Apr 1959 317.56
5 May 1959 318.13
6 Jun 1959 318.00
Our dataset is now in the correct format for us to use Prophet, where:
ds holds the dates
y holds the CO2 values
The Prophet model is fitted to the prepared dataset co2_dataframe using the default parameters.
The following code fits the model:
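A minimal sketch of the fitting step, assuming the model object is called prophet_model (the name used in the later predict() call) and that prophet() is called with no extra arguments, as "default parameters" suggests. The dataframe is rebuilt here so the block runs on its own.

```r
library(prophet)

# Rebuild the prepared dataframe so this sketch is self-contained
co2_dataframe = data.frame(
  ds = zoo::as.yearmon(time(co2)),
  y = co2)

# Fit Prophet with its default settings
prophet_model = prophet(co2_dataframe)
```

Because the data are monthly, Prophet typically reports that it is disabling weekly and daily seasonality, leaving the yearly component seen in the forecast output.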
Prophet is able to automatically detect trends and seasonality patterns:
We can use Prophet to learn from historical data and detect underlying patterns and prepare for forecasting future trends.
The function make_future_dataframe() is used to extend
the timeline for which CO2 levels are forecasted.
periods = 12 extends the forecast by 12 future periods
freq = "month" sets the frequency of data to monthly
Setting these conditions is important because Prophet needs to know how far into the future to forecast, and it ensures the dates are consistent with the original data’s time frame.
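A sketch of this step, assuming the extended dataframe is called forecast_co2 (the name passed to predict() later on). The model is refitted here only so the block runs on its own.

```r
library(prophet)

# Rebuild the dataframe and model so this sketch is self-contained
co2_dataframe = data.frame(ds = zoo::as.yearmon(time(co2)), y = co2)
prophet_model = prophet(co2_dataframe)

# Keep the historical dates and append 12 further monthly dates
forecast_co2 = make_future_dataframe(prophet_model, periods = 12, freq = "month")

# The last rows are the new, future dates
tail(forecast_co2)
```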
The following code uses the predict() function to
generate forecasts for the future dates.
# predict future CO2 levels
predict_co2 = predict(prophet_model, forecast_co2)
# preview the forecast dataframe
head(predict_co2)
          ds    trend additive_terms additive_terms_lower additive_terms_upper
1 1959-01-01 315.3626 -0.0775880 -0.0775880 -0.0775880
2 1959-02-01 315.4469 0.5946394 0.5946394 0.5946394
3 1959-03-01 315.5230 1.2325855 1.2325855 1.2325855
4 1959-04-01 315.6073 2.4609156 2.4609156 2.4609156
5 1959-05-01 315.6888 3.0206586 3.0206586 3.0206586
6 1959-06-01 315.7731 2.3515302 2.3515302 2.3515302
yearly yearly_lower yearly_upper multiplicative_terms
1 -0.0775880 -0.0775880 -0.0775880 0
2 0.5946394 0.5946394 0.5946394 0
3 1.2325855 1.2325855 1.2325855 0
4 2.4609156 2.4609156 2.4609156 0
5 3.0206586 3.0206586 3.0206586 0
6 2.3515302 2.3515302 2.3515302 0
multiplicative_terms_lower multiplicative_terms_upper yhat_lower yhat_upper
1 0 0 314.8181 315.7518
2 0 0 315.5872 316.5162
3 0 0 316.2924 317.2291
4 0 0 317.6156 318.5316
5 0 0 318.2636 319.1573
6 0 0 317.6606 318.5960
trend_lower trend_upper yhat
1 315.3626 315.3626 315.2850
2 315.4469 315.4469 316.0415
3 315.5230 315.5230 316.7556
4 315.6073 315.6073 318.0682
5 315.6888 315.6888 318.7095
6 315.7731 315.7731 318.1247
The important columns for our analysis are:
The forecast plot shows the historical data and the predicted future trend and is plotted below.
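A sketch of how such a plot can be produced with Prophet's own plotting functions. The object names follow the ones used in this report; the component plot is an optional extra, included here as an assumption about how the trend and yearly seasonality might be inspected.

```r
library(prophet)

# Rebuild the pipeline so this sketch is self-contained
co2_dataframe = data.frame(ds = zoo::as.yearmon(time(co2)), y = co2)
prophet_model = prophet(co2_dataframe)
forecast_co2 = make_future_dataframe(prophet_model, periods = 12, freq = "month")
predict_co2 = predict(prophet_model, forecast_co2)

# Observed values as points, forecast as a line with an uncertainty band
plot(prophet_model, predict_co2)

# Separate panels for the trend and the yearly seasonal component
prophet_plot_components(prophet_model, predict_co2)
```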
Here is also an interactive plot of CO2 levels. Hover over the graph to explore exact CO2 levels at certain dates.
This plot is made with the plotly package.
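One way an interactive version could be built is sketched below. The trace layout, the names interactive_plot and observed, and the axis titles are illustrative assumptions; the report's own plotly code may differ.

```r
library(prophet)
library(plotly)

# Rebuild the pipeline so this sketch is self-contained
co2_dataframe = data.frame(ds = zoo::as.yearmon(time(co2)), y = co2)
prophet_model = prophet(co2_dataframe)
forecast_co2 = make_future_dataframe(prophet_model, periods = 12, freq = "month")
predict_co2 = predict(prophet_model, forecast_co2)

# Observed values, reusing the forecast's date column for the historical rows
observed = data.frame(
  ds = predict_co2$ds[seq_len(nrow(co2_dataframe))],
  y = as.numeric(co2))

# Forecast as a line, observations as markers; hovering shows date and value
interactive_plot = plot_ly()
interactive_plot = add_lines(interactive_plot, data = predict_co2,
                             x = ~ds, y = ~yhat, name = "Forecast")
interactive_plot = add_markers(interactive_plot, data = observed,
                               x = ~ds, y = ~y, name = "Observed")
interactive_plot = layout(interactive_plot,
                          xaxis = list(title = "Date"),
                          yaxis = list(title = "CO2 (ppm)"))
interactive_plot
```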
Overall trend
Seasonal Patterns
The two graphs below are a comparison of atmospheric CO2 concentration levels over the first and last 5 periods, 1959-1964 and 1993-1997, of the historic data. Each graph displays individual monthly CO2 measurements alongside a fitted linear regression line to highlight the overall trend in each period.
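A sketch of how the two gradients could be computed. It assumes time is measured in days since 1970-01-01 (built from first-of-month dates) and that the periods are January 1959 to December 1963 and January 1993 to December 1997; the names co2_days, first_period and last_period are hypothetical, and the exact values depend on how the periods are cut.

```r
# Assumption: first-of-month dates, converted to days since 1970-01-01
dates = seq(as.Date("1959-01-01"), by = "month", length.out = length(co2))
co2_days = data.frame(ds = dates,
                      time_numeric = as.numeric(dates),
                      y = as.numeric(co2))

# Assumed period boundaries
first_period = subset(co2_days, ds < as.Date("1964-01-01"))
last_period = subset(co2_days, ds >= as.Date("1993-01-01"))

# Slope of a straight-line fit within each period
gradient_first = coef(lm(y ~ time_numeric, data = first_period))[["time_numeric"]]
gradient_last = coef(lm(y ~ time_numeric, data = last_period))[["time_numeric"]]

cat("Gradient first period:", gradient_first, "\n")
cat("Gradient last period:", gradient_last, "\n")
```

With time in days, each gradient reads as the change in CO2 per day; multiplying by 365.25 gives an approximate yearly rate.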
## Gradient first 5 years (1959-1964): 0.001657833
## Gradient last 5 years (1993-1997): 0.004221945
First 5 Years (1959-1964)
Last 5 Years (1993-1997)
The comparison shows that CO2 concentrations grew faster in the later period: the fitted gradient for 1993-1997 (about 0.0042) is roughly 2.5 times that for 1959-1964 (about 0.0017). This aligns with broader concerns regarding the impact of industrialisation and human activities on CO2 emissions that contribute to climate change.
Linear regression can be carried out to understand the overall growth rate of CO2 levels over time. The plot displays the historical CO2 concentration data (black dots) alongside a fitted linear regression line (in red).
The red regression line shows a clear upward trend, indicating that CO2 levels have been steadily increasing over the observed period.
The scatter of points around the line reflects seasonal fluctuations that the linear model does not capture, but the trend line effectively summarises the overall increase in CO2 concentrations.
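A sketch of the regression and plot using base R. It assumes time_numeric is the date expressed as days since 1970-01-01, which is consistent with a fitted intercept near 326; the names co2_lm_data and linear_model are hypothetical.

```r
# Assumption: first-of-month dates, converted to days since 1970-01-01
dates = seq(as.Date("1959-01-01"), by = "month", length.out = length(co2))
co2_lm_data = data.frame(time_numeric = as.numeric(dates),
                         y = as.numeric(co2))

# Straight-line fit of CO2 against time
linear_model = lm(y ~ time_numeric, data = co2_lm_data)
summary(linear_model)

# Historical data as black dots with the fitted line in red
plot(dates, co2_lm_data$y, pch = 16, cex = 0.5,
     xlab = "Date", ylab = "CO2 (ppm)")
lines(dates, fitted(linear_model), col = "red", lwd = 2)
```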
Call:
lm(formula = y ~ time_numeric, data = co2_dataframe)
Residuals:
Min 1Q Median 3Q Max
-6.0413 -1.9469 0.0004 1.9106 6.5161
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 3.260e+02 1.514e-01 2153.2 <2e-16 ***
time_numeric 3.580e-03 2.944e-05 121.6 <2e-16 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 2.619 on 466 degrees of freedom
Multiple R-squared: 0.9694, Adjusted R-squared: 0.9694
F-statistic: 1.478e+04 on 1 and 466 DF, p-value: < 2.2e-16
Statistical significance
Goodness of fit
r-squared = 0.9694 means that 96.94% of the variability in CO2 levels is explained by the linear model, indicating an excellent fit
adjusted r-squared = 0.9694 adjusts for the number of predictors in the model; it is nearly identical to the regular r-squared, as expected for a model with a single predictor
residual standard error = 2.619 indicates the typical deviation of the actual CO2 levels from the fitted values. Lower values indicate a better fit; the relatively high value here is most likely due to the seasonal cycle, which the linear model ignores
F-statistic and P-value
Coefficients
Limitations of linear model
This graph helps to highlight which months typically have higher or lower CO2 levels. If a clear pattern emerges, it suggests a strong seasonal effect.
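A base R sketch of this kind of month-by-month view. The names monthly_means and co2_by_year are hypothetical and the plotting style is an assumption, not the report's own code.

```r
# Month number for each observation (1 = January, ..., 12 = December)
month_index = as.integer(cycle(co2))

# Average CO2 level for each calendar month across all years
monthly_means = tapply(as.numeric(co2), month_index, mean)
names(monthly_means) = month.abb
round(monthly_means, 2)

# One column per year, one row per month
co2_by_year = matrix(as.numeric(co2), nrow = 12)

# One line per year, so the repeated within-year shape is visible
matplot(1:12, co2_by_year, type = "l", lty = 1,
        xlab = "Month", ylab = "CO2 (ppm)")
```

Lines that keep the same shape while shifting upward from year to year point to a stable seasonal cycle on top of a rising trend.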
Interpretation of year over year CO2 levels by month graph
We can also explore the rate of change in CO2 levels month over month.
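A minimal sketch of the month-over-month change, using diff() on the original time series. The name monthly_change is hypothetical.

```r
# Change in CO2 from each month to the next (one value fewer than the series)
monthly_change = diff(co2)

# Positive values are monthly rises, negative values are monthly falls
plot(monthly_change, xlab = "Year", ylab = "Monthly change in CO2 (ppm)")
abline(h = 0, lty = 2)

# Average monthly change over the whole record
mean(monthly_change)
```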
Interpretation of the monthly rate of change in CO2 levels graph
Prophet successfully captured the overall trend of rising CO2 levels, along with the consistent seasonal cycles seen throughout the dataset. The model’s forecasted values extended the historical trend into the future while maintaining the identified seasonal characteristics.
Min. 1st Qu. Median Mean 3rd Qu. Max.
313.2 323.5 335.2 337.1 350.3 366.8
[1] "1959-01-01" "1997-12-01"
[1] 337.0535
[1] 14.96622
[1] 0
The summary statistics for CO2 levels from historical data show:
These statistics are consistent with a steady upward trend in CO2 levels with regular seasonal variations, as Prophet assumes.
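The historical summary statistics can be reproduced with base R along the following lines. The names co2_values and dates are hypothetical, and the dates are rebuilt as first-of-month values.

```r
# CO2 measurements as a plain numeric vector
co2_values = as.numeric(co2)

# First-of-month dates matching each observation
dates = seq(as.Date("1959-01-01"), by = "month", length.out = length(co2_values))

summary(co2_values)      # minimum, quartiles, mean and maximum
range(dates)             # first and last date in the record
mean(co2_values)         # average CO2 level
sd(co2_values)           # spread of CO2 levels around the mean
sum(is.na(co2_values))   # number of missing values
```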
Min. 1st Qu. Median Mean 3rd Qu. Max.
312.7 323.7 336.0 337.8 351.5 367.8
[1] "1959-01-01 GMT" "1998-12-01 GMT"
[1] 337.7525
[1] 15.40796
Increased CO2 levels
Seasonal Patterns
Greater Variability
The analysis using Prophet showed that atmospheric CO2 levels have consistently increased from 1959 to 1997, with clear seasonal patterns and a strong upward long term trend. Prophet’s forecast suggests that if current patterns continue, CO2 levels will keep rising steadily in the coming years.
Long Term Trend:
The model identified a continuous rise in CO2 concentrations, indicating that emissions have been increasing over time. If this trend persists, it could contribute to worsening climate effects such as global warming, rising sea levels and extreme weather conditions.
Seasonality: The analysis also confirmed strong yearly seasonal patterns, where CO2 levels rise and fall within each year. This reflects natural processes like photosynthesis, which absorbs more CO2 during certain months. However, while these natural cycles remain consistent, they don’t contribute to the long term rise in CO2 levels.
The Prophet model effectively captured both the long term trend and seasonal variations in CO2 levels. It suggests that without significant intervention, CO2 concentrations will continue to rise, contributing to climate related challenges.
This analysis highlights the importance of long term strategies to reduce emissions. Although natural cycles help balance CO2 in the short term, they aren’t enough to counteract the steady upward trend. Reducing emissions through policy, technology and behavioural changes is critical to avoiding severe environmental impacts.
Overall, Prophet’s forecast emphasises the need for ongoing monitoring and for action on the human decisions that affect the environment.
This quote from the Lorax serves as a reminder that addressing environmental challenges like CO2 emissions requires collective care and action.